APPLICATION OF THE STATISTICAL METHOD IN THE STUDY OF EPIDEMIOLOGICAL RELATIONS


[Following is the translation of an article by M. N. Tkacheva,
Gamaleya Institute of Epidemiology and Microbiology, AMN, USSR,
appearing in the Russian-language periodical Zhurnal Mikrobiologii,
Epidemiologii i Immunobiologii (Journal of Microbiology,
Epidemiology and Immunobiology) #9, 1964, pages 60-64. It
was submitted on 27 Jan 1964. Translation performed by Sp/7
Charles T. Ostertag Jr.]


The theory of epidemiology and antiepidemic practice call for the
development of a method for studying the relations which emerge during
the development of the epidemic process. To obtain objective criteria
for evaluating these relations it is necessary to apply the statistical
method, which up to the present time has not occupied its proper place
in epidemiological investigations.

In becoming acquainted with the relevant literature, we encountered
several works in which the statistical analysis of the relation between
phenomena was applied with greater or lesser success.

Vanskaya (1947), when studying the number of Musca domestica and,
correspondingly, morbidity with intestinal infections under the conditions
of urban living, calculated the correlation coefficient, which confirmed
the presence of a relationship between the level of morbidity and the
number of flies.

Verbev (1961), when studying the dependency between mass immunization
and morbidity with an inoculation form of epidemic hepatitis, made
extensive use of the correlation coefficient between epidemic hepatitis
incidence and the percentage of persons who had been immunized against
typhoid, smallpox and diphtheria. The results obtained made it possible
to confirm the hypothesis put forward concerning the absence of a link
between mass immunization and morbidity with an inoculation form of
epidemic hepatitis.

Abou-Greeb (1960) studied the dependency of cholera morbidity in
Calcutta on the density of the population and other factors. He was able
to confirm a relation between the level of morbidity and both the
material well-being of the population and the organization of the water
supply system in one or another sector of the city. A relationship
between cholera morbidity and the density of the population was not
confirmed statistically.




At first glance the result of comparing morbidity with the density
of the population raises doubts; however, even here an analysis of the
material nature of the phenomena makes it possible to substantiate the
results obtained by the statistical method. The cited work deals with
the density of the population, that is, the number of persons living in
an area of 1 km², which in our opinion can in no way be identified with
the concept of congestion, which actually is related to the level of
incidence of various infections. As an example it is possible to cite
Belgium, where, with a very high population density, a very low level of
infectious morbidity is observed (data from official statistics).

In the present work we have set the goal of acquainting the readers
with elementary methods of studying relationships. To readers desiring
a deeper acquaintance with this method, we recommend the following
textbooks: B. S. Bessmertnyy and M. N. Tkacheva "Statistical Methods in
Epidemiology" (1961), P. F. Rozhitskiy "Bases of Variation Statistics
for Biologists" (1961), B. Hill "Bases of Medical Statistics" (1958).


Two forms of relationship are distinguished in the objective world.
A relation is called functional if to each value of one phenomenon there
corresponds a strictly definite value of another. In epidemiology and
the branches of medicine adjacent to it, relations prevail in which
changes in the acting factor and the results of its influence may not
run parallel. In other words, a close relation exists when (over
numerous observations) a certain average change in the value of one
corresponds to a specific change in the other. These relations are
called correlations, and a number of methods, discussed below, are used
for their study.

A  determination  of  the  correlation  coefficient  serves  as  the  most 
general  statistical  method  for  measuring  such  relationships. 

We applied correlation analysis for the determination of the
influence of the value of the seasonal wave on the level of annual
dysentery morbidity in localities.

We determined the volume of the seasonal wave with the help of the
formula

[formula illegible in the source]

suggested by Khayfete and Khaeanov (1959), having changed somewhat the
authors' suggested interpretation of the values entering into the
formula. These values are: the value of the true seasonal rise (in
percentages); the number of cases in a year; the number of cases
registered during the period of the seasonal rise; and the number of
months of the seasonal rise, determined with a consideration of the
average climatic conditions of the territory on which the morbidity
analysis is being carried out. (The symbols denoting these values are
illegible in the source.)


Values characterizing changes in the criteria being compared are
arranged in the form of two series. Series x is morbidity with acute
dysentery in conditional indices and series y is the value of the
seasonal rise in percentages. For each series of numbers the average
value is determined, then for each number its deviation from the series
average is established. The deviations obtained are squared and added
up. Paired deviations of both series are multiplied by each other and
the products also added. The correlation coefficient is computed
according to the quoted formula, r = Σ(d_x · d_y) / √(Σd_x² · Σd_y²),
where r is the correlation coefficient, d_x and d_y are the deviations
of the variant from the average of the series, and Σ is the addition
symbol. The correlation coefficient obtained, equal to 0.86,
characterizing the relation between the seasonal rise and the annual
level of morbidity with this infection in the RSFSR, testifies to the
presence of a very close relation between these phenomena (table 1).
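
The computation described above (series averages, deviations, sums of squares and of products) can be sketched in a few lines of Python. This is only an illustration: the function name and the two short series are hypothetical and are not the data of table 1.

```python
import math


def correlation_coefficient(xs, ys):
    """Correlation coefficient from deviations about the series averages.

    r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("both series need the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation of each value from the average of its own series.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dx2 = sum(d * d for d in dx)                # squared deviations, series x
    sum_dy2 = sum(d * d for d in dy)                # squared deviations, series y
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # products of paired deviations
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("a constant series has no defined correlation")
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical illustrative series (NOT the values of table 1).
morbidity = [1.0, 2.0, 3.0, 4.0, 5.0]
seasonal_rise = [2.0, 4.0, 5.0, 4.0, 5.0]
r = correlation_coefficient(morbidity, seasonal_rise)
print(round(r, 2))
```

A value near +1 or -1 indicates a close direct or inverse relation, and a value near 0 indicates the absence of a linear relation.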

It must be stressed that the correlation coefficient, just as
other criteria for measuring the closeness of a relation, serves only as
a confirmation of the hypothesis advanced by researchers concerning the
presence or absence of a relation between the phenomena being compared.
At the same time, a considerable magnitude of the correlation
coefficient in an analysis of the values of two seemingly unrelated
features may compel researchers to analyze the phenomena under study
more deeply and lead to the idea that material foundations exist for a
relation between them.

The epidemiologist often has to investigate phenomena in which the
variation is exhausted by two incompatible possibilities (alternative
variation). This takes place, for example, during an investigation of
the effectiveness of inoculations (inoculated -- not inoculated, became
ill -- did not become ill). Measurement of the relation in these cases
is conducted according to the so-called method of constructing a four
symbol table, by means of the subsequent calculation of the association
coefficient. An example illustrating the method of determining the
association coefficient is taken from the current work of our
laboratory, which at the present time is engaged in a study of the
epidemiological role of various manifestations of diphtheria infection
and, in particular, a study of the rules characterizing the ability to
carry the diphtheria bacillus. One of the trends in the investigation
was the study of the relation between the nonspecific resistance of the
children investigated and the ability to carry the diphtheria bacillus.
As a conditional test we utilized the Nesterov test characterizing
the C-vitamin saturation of the human organism. In our work, which was
carried out earlier (Zenkevich and Tkacheva, 1960), we presented a
detailed characterization of the Nesterov test as an index of
nonspecific resistance, resting on a considerable number of
observations.


The data obtained during the investigation of children from one of
the children's homes were arranged by us in the form of a four symbol
table, where a -- number of carriers possessing a sufficient level of
nonspecific resistance; b -- number of children not yielding the
diphtheria bacillus but having a sufficient level of nonspecific
resistance; c -- number of carriers among the children with a lowered
nonspecific resistance; d -- number of children with a lowered
nonspecific resistance in which the diphtheria bacillus was not
isolated (table 2).


After substituting our specific data into the formula we obtain
the association coefficient, equal to -0.86. The minus sign demonstrates
the inverse relation (the more children there were in the collective who
had an adequate nonspecific resistance, the fewer carriers there were in
it). The value 0.86 characterizes the high degree of closeness of the
relationship between nonspecific resistance and the ability to carry the
diphtheria bacillus.
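
The formula itself does not survive in this copy. A minimal sketch, assuming the usual association coefficient for a four symbol table, Q = (ad - bc) / (ad + bc), is given below. Here a and b are taken as the numbers of carriers and non-carriers among children with sufficient resistance, and c and d as the numbers of carriers and non-carriers among children with lowered resistance. The counts are hypothetical and are not those of table 2.

```python
def association_coefficient(a, b, c, d):
    """Association coefficient Q = (a*d - b*c) / (a*d + b*c) for a 2 x 2 table.

    Layout assumed here:
                           carriers   non-carriers
    sufficient resistance      a           b
    lowered resistance         c           d
    """
    numerator = a * d - b * c
    denominator = a * d + b * c
    if denominator == 0:
        raise ValueError("coefficient undefined when a*d + b*c is zero")
    return numerator / denominator


# Hypothetical counts (NOT the data of table 2).
q = association_coefficient(a=5, b=45, c=20, d=30)
print(round(q, 2))
```

With these invented counts carriers are rarer among the children with sufficient resistance, so the product bc outweighs ad and the coefficient comes out negative, which is the inverse relation described in the text.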


The association coefficient is used solely for measuring the
closeness of the relation between criteria characterizing qualitative
indices.

A very simple method is known for measuring the closeness of
relationships. It is applicable for analyzing the relationships between
quantitative and also between qualitative criteria. This is the method
of determining the coefficient of rank correlation.

We present the following example as an illustration. It is commonly
known that in cities the indices of morbidity with acute dysentery are
higher than in rural areas. Since the oblasts differ in the percentage
ratio of urban and rural population, it may be proposed that a
relationship exists between the proportion of the urban population in
the oblast and the incidence of acute dysentery.

We  will  introduce  two  series  of  values  --  the  proportion  of  the 
urban  population  and  morbidity  with  acute  dysentery  (table  3). 

We will arrange the oblasts in the order of decreasing morbidity
indices. In this way each oblast receives a rank (serial number) based
on the index of morbidity and another based on the proportion of the
urban population. The determination of the coefficient of rank
correlation stems from the difference of ranks for each of the two
designated criteria for each oblast. If the values obtained are
substituted in the formula presented in table 3, then ρ amounts to 0.51,
which testifies to a direct relationship of average closeness between
the level of dysentery morbidity and the proportion of the urban
population. In this manner, our supposition concerning the presence of a
relationship between these criteria was confirmed.
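
The formula of table 3 is not reproduced in this copy. A minimal sketch, assuming the usual rank correlation formula ρ = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of one oblast and n is the number of oblasts, is given below. The data are hypothetical, and the ranking helper assumes that no two oblasts share the same value (no tied ranks).

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_correlation(xs, ys):
    """Rank correlation coefficient: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("both series need the same length (at least 2)")
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    # Squared difference between the two ranks of each unit, summed.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical data for five oblasts (NOT the values of table 3).
urban_share = [80, 65, 50, 45, 30]      # proportion of urban population, %
morbidity = [300, 340, 200, 150, 180]   # dysentery morbidity index
rho = rank_correlation(urban_share, morbidity)
print(round(rho, 2))
```

Because only the ranks enter the calculation, the same procedure serves for criteria that can be ordered but not measured exactly.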

The methods presented and the examples cited concern only the
comparison of two factors and relate primarily to the analysis of a
causal relationship. The majority of phenomena investigated by
epidemiologists are under the simultaneous influence of many factors.
As a result a new problem emerges -- to attempt to break up this
multifactorial complex and qualitatively characterize the value of each
of the factors in this complex.

The study of this problem should begin with the application of
the methods of fractional (partial) correlation and dispersion analysis
(analysis of variance) to epidemiological phenomena. The set up for
carrying out dispersion analysis is quite complex and it is impossible
to present it in a summary. We used the method of dispersion analysis
in its most elementary form when studying the problem of the degree to
which the Nesterov test may be used as a conditional test for
characterizing nonspecific resistance. The results obtained showed that
the dominating influence on the degree of nonspecific resistance is
whether the child had suffered an infectious or catarrhal disease not
long before the investigation was set up. Living conditions exerted
three times less influence, and accidental factors, unaccounted for by
us, were 12 times weaker than the influence of the disease suffered.
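
The article does not describe the calculation itself. As a rough sketch of what an elementary dispersion analysis for a single factor can look like, the share of the total variation attributable to the factor may be taken as the ratio of the between-group sum of squares to the total sum of squares. This is an assumed, commonly used one-factor scheme, not necessarily the one applied by the author, and the groups and values below are hypothetical.

```python
def factor_share_of_variation(groups):
    """Share of total variation explained by grouping on one factor.

    Returns between-group sum of squares / total sum of squares.
    """
    if not groups or any(len(g) == 0 for g in groups):
        raise ValueError("every group must contain at least one value")
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    if total_ss == 0:
        raise ValueError("no variation in the data")
    # Each group contributes its size times the squared distance of its
    # mean from the grand mean.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return between_ss / total_ss


# Hypothetical test readings for two groups of children (invented values).
recently_ill = [2.0, 3.0, 4.0]
not_recently_ill = [6.0, 7.0, 8.0]
share = factor_share_of_variation([recently_ill, not_recently_ill])
print(round(share, 2))
```

Comparing such shares for several factors is one way of expressing statements such as "three times less influence"; the remaining share corresponds to factors not accounted for.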

We  know  of  the  successful  application  of  this  method  in  the  works 
of  Zatsepin  (1939)  and  Akatov  and  Lebedeva  (1961). 

It is obvious that the place of this method in epidemiological
investigations has not yet been sufficiently determined. This method
can receive acceptance only as the result of its wide application in
research.

The numerical expression of the degree of common variability of
phenomena imparts great definiteness to conclusions and judgments
obtained as a result of investigations. It becomes possible to compare
the degrees of closeness of the relationships of the phenomenon being
studied with various factors. This in turn makes it possible to
determine the factor exerting the dominating influence on the criterion
being investigated.

The  methodological  nature  of  the  article  does  not  demand  special 
conclusions. 

Table 2. Processing the results from the investigation of children
from one children's home with the help of the four symbol table

Table 3. Relation between morbidity with acute dysentery and the make up
of the population in the city and the country (data of 1960 based on a
selected circle of oblasts in the RSFSR)


